REMARKS

Claims 1-25 remain in the application. The applicant has amended independent claims 1, 
12, 19 and 22 to more clearly define the invention. No new matter is added. In view of the 
amendment and the following arguments, the applicant respectfully submits that the pending 
claims 1-25 are now in condition for allowance. 

In the Office Action dated September 6, 2005, all claims were rejected under 35 U.S.C. 
102(b) and 103(a), in view of a number of references. In the Office Action dated June 5, 2006, 
those rejections were determined by the Examiner to be moot in view of a new rejection under 
35 U.S.C. §102(b) and §103(a). As detailed below, there is no proper basis for the new 
rejections. It is understood that the §102 and §103 rejections from the Office Action dated 
September 6, 2005, have been reconsidered and withdrawn. 

To more clearly obviate the rejections of the September 6, 2005 Action, claims 1, 12, 19 
and 22, the only independent claims in the application, have each been amended to define an 
input signal characterized by a frequency spectrum and define a linear transducer which 
produces an output vibration characterized by a frequency spectrum which is a substantially 
linear function of the input signal, whereby the frequency spectrum of said vibration matches 
said frequency spectrum of said input signal. None of the cited references teach or disclose this 
feature of the invention as claimed. 

In the Office Action dated June 5, 2006, claims 1-5, 7-15, and 17-25 were rejected under 
35 U.S.C. § 102(b) as being anticipated by Kenneth Houston, "Development of Sound Source 
Components for a New Electrolarynx Speech Prosthesis," September 1998, The Charles Stark 
Draper Laboratory, Inc. (the "Kenneth Houston" reference), set forth in Exhibit A, and claims 6 
and 16 were rejected under 35 U.S.C. §103(a) as unpatentable over the Kenneth Houston 
reference in view of U.S. Patent No. 5,128,905 (Amatt). 

In the Action, the Examiner cited the Kenneth Houston reference as the basis for the §102 
and §103 rejections, relying on the undersigned's remarks on page 10, second paragraph of the 
Response to Office Action dated March 3, 2006. Regrettably, those remarks were in error. 
Those remarks, set in single-spaced small font, were intended to read as follows: 

"...Even if Espy-Wilson can claim basis for a filing based on the provisional application upon which 
priority is claimed, September 3, 1999, the applicant can demonstrate prior invention based on a paper published in 
September 1998, internally, by the assignee of the present application, and submitted to the IEEE International 
Conference on Acoustics, Speech and Signal Processing in March 1999. ..." 

In fact, the Kenneth Houston reference was not "published" in a §102 or §103 sense (but 
rather was distributed internally within the assignee) until it was submitted to the IEEE in March 
1999. Based on that date, the Kenneth Houston reference is not a proper basis for a §102 or 
§103 rejection, since the subject application was filed December 30, 1999. Accordingly, the 
outstanding §102 or §103 rejections should be reconsidered and withdrawn. 

Conclusion 

For the reasons discussed above, and for the reasons set forth in the March 3, 2006, 
Response to Office Action, the applicant respectfully submits that claims 1-25 are patentable 
over the cited references, whether considered alone or in combination, and respectfully requests 
reconsideration and withdrawal of the rejections to these claims under 35 U.S.C. 102(b) and 
103(a). If a telephone conference will expedite prosecution of the application, the Examiner is 
invited to telephone the undersigned. 

No additional costs are believed to be due in connection with the filing of this paper. 
However, the Commissioner is hereby authorized to charge any additional fees, or credit any 
overpayment, to our Deposit Account No. 50-2678. 



Respectfully submitted, 




Greenberg Traurig, LLP 
Customer No. 35893 



One International Place 
Boston, MA 02110 

Tel: (617) 310-6000, Fax: (617) 310-6001 

ABSTRACT 

For many individuals who lose their voices due to laryngeal 
cancer or trauma, the only option for speech is to use an 
electrolarynx (EL), which is a battery-powered vibrator that is 
held to the throat. Current devices produce speech that is very 
machine-like in sound, with low levels of loudness and 
intelligibility, that also draws undesired attention to the user. A 
project at Draper Laboratory, the Mass. Eye and Ear Infirmary 
and MIT aims to develop a much improved EL called the 
Electrolarynx Communication System (ELCS), which is a DSP-based 
device consisting of sound source, control, and speech 
enhancement subsystems or modules. This paper introduces the 
ELCS and discusses developments to date in the sound source 
module. Specific topics include the design of a new linear EL 
transducer and investigations into glottal waveform synthesis 
which should result in a much more natural speech output. 

1. INTRODUCTION 

Every year in the United States alone, thousands of people lose 
the ability to produce voice and speech because of laryngeal 
cancer or trauma. For many of these individuals, the only option 
for speech is to use an electrolarynx (EL), which is a battery- 
operated vibrator that is held against the throat. Unfortunately, 
current devices produce speech that is very machine-like in 
sound, with low levels of loudness and reduced intelligibility, 
that also draws undesired attention to the user. Draper 
Laboratory is involved in a collaborative effort with the 
Massachusetts Eye and Ear Infirmary (MEEI) and MIT called the 
Voice Project of the W. M. Keck Neural Prosthesis Research 
Center. The aim is to design a new DSP-based EL called the 
Electrolarynx Communication System (ELCS) which should 
offer many improvements in sound quality over previous models. 

As Figure 1 indicates, the ELCS has three subsystems or 
modules: 1) The Sound Source Module consists of a waveform 
generator, power amplifier and a linear shaker transducer, and 
represents the complete functionality of current ELs. 2) The 
Sound Source Control Module provides pitch and amplitude 
control to the sound source based upon neural inputs. We 
envision that when the larynx is removed in the future, the 
severed laryngeal nerve will be transposed (implanted) into a 
strap muscle near the skin surface. Once the nerve regenerates, 
the muscle will act as an amplifier of neural signals. 
Electromyographic (EMG) electrodes on the skin surface will 
detect the redirected laryngeal nerve activity and control signals 
will be derived, hopefully much in the same way that control 
signals for prosthetic limbs are obtained today. 3) The Speech 
Enhancement Module is a real-time enhancement system to 
further improve the output quality. At a minimum, it will 
perform low-frequency emphasis and amplification to a 
loudspeaker when used in noisy environments. Many other 
improvements are possible, such as the correction for speech 
distortions caused by alterations to the vocal tract by the 
laryngectomy operation. 

Figure 1. Block diagram of the new Electrolarynx 
Communication System. 

2. LINEAR TRANSDUCER DESIGN 

Figure 2 shows a non-linear transducer which is representative of 
current EL designs. An armature pulsating at the pitch frequency 
is made to strike a coupler which is held to the throat. The 
coupler conducts the impulses into the pharynx. The coupler's 
mechanical characteristics control many aspects of the resulting 
speech spectrum. Non-linear transducers inherently limit EL 
designs in the following ways: 1) There is generally a low-frequency 
deficit below approximately 500 Hz which makes 
certain vowels hard to distinguish, 2) the spectral envelope is 
difficult to control, 3) there is a very high level of self-noise, 
which represents a constant interference to the desired signal, 
filling in spectral and temporal "valleys" where sound should be 
absent, and 4) there is a lack of variation in the harmonic 
structure, giving the sound a metallic and machine-like quality. 
Developing a linear transducer for an EL is critical because this 
allows use of arbitrary driving waveforms. Purely electronic 
waveform synthesis allows for rapid responses to control inputs, 
permits adjustment of the spectrum as desired, and enables 
inclusion of features which improve the naturalness of the 
resulting sound. 

Figure 2. Representative non-linear transducer used in 
current electrolarynx designs. An armature pulsating at 
the fundamental pitch frequency is caused to strike a 
coupler disk which is held to the neck. The spectral 
characteristics of the output speech are determined by the 
mechanical characteristics of the coupler assembly. 

Figure 3. Notional view of the linear EL transducer under 
development, which draws heavily upon loudspeaker 
technology. The coupler disk which is held to the neck is 
attached to the voice coil cylinder. The linear nature of 
the device allows use of electronic waveform synthesis 
which should result in substantially improved sound 
quality. 

Figure 3 diagrams the new linear transducer. Owing to the 
similarity to moving-coil loudspeaker technology, a loudspeaker 
manufacturer is fabricating initial prototypes at the time of this 
writing. Figure 4 shows equivalent circuits for the transducer 
[1]. Like a loudspeaker, an electromechanical model defines a 
motor constant $\phi_m = BL$ that transforms between the electrical and 
mechanical domains using Force-Voltage/Velocity-Current 
analogies. Unlike a loudspeaker, the neck mechanical impedance 
represents the load rather than acoustic radiation alone. (Ideally, 
acoustic radiation results only when the vibrating pharynx wall 
interacts with air inside the throat to set up a sound wave - the 
resulting volume velocity should replicate a normal glottal 
source. The additional loading due to acoustic radiation is small 
relative to the neck impedance and can be ignored.) 



[Figure 4 diagram; legible block label: Armature, Coupler & Suspension] 

Figure 4. (top) Linear transducer electro-mechanical 
equivalent circuit diagram. The mechanical impedance 
of the neck serves as the load to the device. (bottom) 
Purely electrical equivalent circuit. 

Figure 5. System for the measurement of the neck 
mechanical impedance. The impedance head is a device 
which simultaneously measures force and acceleration. 

Figure 6. (top) Representative plot of real part of 
Force/Acceleration ratio, measured and best-fit model 
($m_L - S_L/\omega^2$). (bottom) Imaginary part of 
Force/Acceleration ratio, measured and model ($-j R_{mL}/\omega$). 

                                   Observed Range      Design
Parameter                          Min       Max       Value     Units
Load Mass $m_L$                    1.1       1.9       1.8       grams
Mechanical Resistance $R_{mL}$     8         19        16        N-s/m
Spring Constant $S_L$              1.5       8         3.0       N/mm

Table 1. Summary of estimated neck mechanical 
parameters for limited test of 7 subjects (including 4 
laryngectomees). The "design value" column represents 
nominal values used in the transducer design. 

Figure 7. Expected transducer frequency response for a 
2.63 Vrms swept sinusoid under nominal load 
conditions. 

In order to properly specify the load so that the transducer could 
be designed, measurements were conducted on a very limited set 
of subjects - 3 male laryngectomees, 1 female laryngectomee, and 
3 male non-laryngectomees. Figure 5 shows the test setup. An 
electrodynamic shaker was driven with white noise into a coupler 
which was placed against the subject's throat. An impedance 
head sensor measured axial force and acceleration. The transfer 
function gives the apparent mass $M_L(j\omega)$, which for a series 
Mass-Resistance-Spring combination equals 

$$M_L(j\omega) = \frac{\text{Force}}{\text{Acceleration}} = m_L - \frac{S_L}{\omega^2} - j\,\frac{R_{mL}}{\omega},$$

where $m_L$ = mass in kg, $R_{mL}$ = mechanical resistance in N-s/m 
(equivalent to kg/sec or "mechanical ohms"), and $S_L$ = spring 
constant in N/m (sometimes specified as the compliance 
$C_{mL} = 1/S_L$). The mechanical impedance $Z_{mL}(j\omega)$ is the ratio of 
force to velocity, so 

$$Z_{mL}(j\omega) = \frac{\text{Force}}{\text{Velocity}} = j\omega M_L(j\omega) = R_{mL} + j\left(\omega m_L - \frac{S_L}{\omega}\right)\ \text{N-s/m}.$$
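
The two expressions above can be checked numerically. The following is a minimal sketch, assuming the nominal "design value" parameters of Table 1 converted to SI units; the function names are illustrative and are not taken from the paper.

```python
import math

# Nominal "design value" neck parameters from Table 1, converted to SI units.
M_L = 1.8e-3    # load mass, kg (1.8 grams)
R_ML = 16.0     # mechanical resistance, N-s/m
S_L = 3.0e3     # spring constant, N/m (3.0 N/mm)


def apparent_mass(freq_hz, m=M_L, r=R_ML, s=S_L):
    """Force/Acceleration for a series mass-resistance-spring load, in kg."""
    w = 2.0 * math.pi * freq_hz
    return complex(m - s / w**2, -r / w)


def mechanical_impedance(freq_hz, m=M_L, r=R_ML, s=S_L):
    """Force/Velocity = j*omega*M_L(j*omega), in N-s/m."""
    w = 2.0 * math.pi * freq_hz
    return 1j * w * apparent_mass(freq_hz, m, r, s)


def resonance_hz(m=M_L, s=S_L):
    """Frequency at which the reactance (omega*m - s/omega) vanishes."""
    return math.sqrt(s / m) / (2.0 * math.pi)


if __name__ == "__main__":
    for f in (50.0, 100.0, 200.0, 500.0, 1000.0, 2000.0):
        z = mechanical_impedance(f)
        print(f"{f:7.0f} Hz  Z = {z.real:6.2f} {z.imag:+8.2f}j N-s/m")
    print(f"reactance zero crossing near {resonance_hz():.0f} Hz")
```

With the design values, the real part of the impedance stays at the mechanical resistance and the reactance changes sign near 205 Hz: the load is spring-dominated below that frequency and mass-dominated above it.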

Figure 6 shows a representative plot of the real and imaginary 
parts of the measured transfer function for $M_L(j\omega)$, and best-fit 
curves. It may be seen that the first-order series mass-spring-resistance 
model provides a reasonable fit. Table 1 summarizes 
the measured parameters, which should be valid over 
approximately 50-2000 Hz. It is interesting to note that the 
moving mass in the neck is in the 1-2 gram range, or less than the 
mass of a US penny (2.6 grams). No significant differences 
between laryngectomees and non-laryngectomees were noted in 
our limited sample. The expected nominal velocity frequency 
response for a 2.63 Vrms swept sinusoid excitation is shown in 
Figure 7. The device should have a flat response over 20-2000 
Hz, so the full audio band can be realized (subject to power 
budget limitations at low frequencies and appropriate 
equalization at high frequencies). With the indicated velocity 
(-17.3 dB re 1 m/sec or 0.14 m/sec rms), speech outputs of 
approximately 85 dBA are expected. 
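
The paper does not state how the best-fit curves of Figure 6 were computed. As one plausible approach, the sketch below uses ordinary least squares: the real part of the Force/Acceleration ratio is a straight line in $1/\omega^2$ and the imaginary part is a line through the origin in $1/\omega$. The data are synthetic, generated from the Table 1 design values, and are not measurements; all function names are hypothetical.

```python
import math


def synth_apparent_mass(freqs_hz, m, r, s):
    """Synthetic (not measured) Force/Acceleration samples from the model."""
    out = []
    for f in freqs_hz:
        w = 2.0 * math.pi * f
        out.append(complex(m - s / w**2, -r / w))
    return out


def fit_neck_parameters(freqs_hz, apparent_mass):
    """Least-squares estimate of (m_L, R_mL, S_L) from Force/Acceleration data.

    Real part:  Re = m_L - S_L * x,  x = 1/omega^2  (straight line in x)
    Imag part:  Im = -R_mL * y,      y = 1/omega    (line through the origin)
    """
    xs = [1.0 / (2.0 * math.pi * f) ** 2 for f in freqs_hz]
    ys = [1.0 / (2.0 * math.pi * f) for f in freqs_hz]
    re = [v.real for v in apparent_mass]
    im = [v.imag for v in apparent_mass]

    # Straight-line fit of the real part: intercept is m_L, slope is -S_L.
    n = len(xs)
    x_mean = sum(xs) / n
    re_mean = sum(re) / n
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (v - re_mean) for x, v in zip(xs, re))
    slope = sxy / sxx
    m_l = re_mean - slope * x_mean
    s_l = -slope

    # Zero-intercept fit of the imaginary part: slope is -R_mL.
    r_ml = -sum(y * v for y, v in zip(ys, im)) / sum(y * y for y in ys)
    return m_l, r_ml, s_l


if __name__ == "__main__":
    freqs = [50.0 * 2 ** (k / 4) for k in range(22)]  # about 50 Hz to 1900 Hz
    data = synth_apparent_mass(freqs, m=1.8e-3, r=16.0, s=3.0e3)
    m_l, r_ml, s_l = fit_neck_parameters(freqs, data)
    print(f"m_L = {m_l * 1e3:.2f} g, R_mL = {r_ml:.1f} N-s/m, S_L = {s_l / 1e3:.2f} N/mm")
```

On noise-free synthetic data the fit simply returns the parameters that generated it; with real impedance-head data the residuals would indicate how well the first-order model holds over the band.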

3. WAVEFORM GENERATOR 

As mentioned above, it is desired to set up a sound wave within 
the pharynx which closely matches a normal glottal excitation. 
The user simultaneously manipulates the vocal tract in the same 
way as in normal speech to produce a speech output at the lips. 
The waveform generator should therefore produce some 
approximation of a glottal source waveform, appropriately 
compensated for distortions introduced by the transduction 
process. 

The literature is replete with glottal waveform models [2]. An 
early model is the Rosenberg model [3], shown in Figure 8. 
When such a waveform is played through a linear EL transducer, 
the resulting sound is somewhat better than a conventional EL, 
but is still highly objectionable: the sound is metallic and 
machine-like. This is primarily because the waveform is defined 
over a single cycle and repeated, and as a consequence, all 
harmonics are in lock-step with the fundamental. Other glottal 
models with more sophisticated parameterization of a single 
cycle suffer from the same problem, even if noise is added or the 
"arrival times" of the impulses are dithered. 
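
The lock-step behavior can be illustrated with a short sketch. It assumes the commonly cited trigonometric form of the Rosenberg type C pulse (a raised-cosine opening phase followed by a quarter-cosine closing phase); the opening and closing fractions, pitch, and sample rate are illustrative values, not parameters from the paper.

```python
import math


def rosenberg_c_cycle(n_samples, open_frac=0.4, close_frac=0.16):
    """One pitch period of a Rosenberg type C (trigonometric) glottal pulse.

    open_frac and close_frac are the opening and closing phase durations as
    fractions of the period; the defaults are illustrative assumptions.
    """
    n_open = int(round(open_frac * n_samples))
    n_close = int(round(close_frac * n_samples))
    cycle = []
    for n in range(n_samples):
        if n < n_open:
            # Opening phase: raised cosine rising from 0 toward 1.
            g = 0.5 * (1.0 - math.cos(math.pi * n / n_open))
        elif n < n_open + n_close:
            # Closing phase: quarter cosine falling from 1 toward 0.
            g = math.cos(math.pi * (n - n_open) / (2.0 * n_close))
        else:
            # Closed phase.
            g = 0.0
        cycle.append(g)
    return cycle


def periodic_excitation(pitch_hz, sample_rate_hz, n_cycles):
    """Repeat a single cycle, as a conventional single-cycle model does."""
    period = int(round(sample_rate_hz / pitch_hz))
    return rosenberg_c_cycle(period) * n_cycles, period


def max_cycle_difference(signal, period):
    """Largest sample-wise difference between consecutive cycles."""
    return max(abs(signal[i] - signal[i - period]) for i in range(period, len(signal)))


if __name__ == "__main__":
    signal, period = periodic_excitation(pitch_hz=100.0, sample_rate_hz=8000.0, n_cycles=5)
    print("samples per cycle:", period)
    print("max cycle-to-cycle difference:", max_cycle_difference(signal, period))
```

Because every cycle is an exact copy of the first, the cycle-to-cycle difference is zero and every harmonic keeps a fixed amplitude and phase relative to the fundamental, which is the property the text identifies as the source of the machine-like quality.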

To obtain a rich, natural sound (whether synthesizing voice or 
musical instruments), a proper harmonic structure is required 
where the overtones drift in frequency relative to the fundamental 
[4]. A simple way to capture the harmonic structure is to record a 
voice and inverse filter, as shown in Figure 9. This is analogous 
to waveform sampling in musical instrument synthesis (known to 
produce high quality results), except in the case of voice, the 
effect of the vocal tract must be removed. A held vowel sound 
(such as /e/ in "bet") is recorded for several seconds, and is 
subsequently LPC-analyzed using a high order filter (N=41) and 
inverse filtered to obtain a whitened residual. Pitch variations 
are then smoothed through interpolation and a low pass filter 
(-12 dB/octave) is applied. An example of the result is shown in 
Figure 10. As can be seen, while similar to Figure 8, there is 
considerable irregularity from cycle to cycle. When the inverse 
filtered waveform is applied to the waveform generator in Figure 
11 and a linear transducer, the metallic quality completely 
disappears, and the speech in fact retains many of the qualities of 
the original speaker. Note that the table of glottal samples must 
be of a certain minimum length (>2 seconds), or else the 
periodicity associated with the table length is quite noticeable. 
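
The LPC analysis and inverse filtering step can be sketched as follows. This is a minimal illustration using the autocorrelation method with the Levinson-Durbin recursion, which is one standard way to compute LPC coefficients; the paper does not say which method was used. The input is a synthetic two-pole resonator driven by noise, standing in for a recorded vowel, and the order is 2 rather than the high order used in the paper. The pitch smoothing and low-pass filtering steps are omitted. All names are hypothetical.

```python
import random


def synth_resonator_signal(n_samples, seed=0):
    """Stand-in for a recording: noise shaped by a known two-pole filter."""
    rng = random.Random(seed)
    x = [0.0, 0.0]
    for _ in range(n_samples):
        x.append(1.5 * x[-1] - 0.7 * x[-2] + rng.gauss(0.0, 1.0))
    return x[2:]


def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of the whole signal."""
    n = len(x)
    return [sum(x[i] * x[i + k] for i in range(n - k)) for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations; returns a[0..order] with a[0] = 1."""
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err                      # reflection coefficient
        prev = a[:]
        for j in range(1, i):
            a[j] = prev[j] + k * prev[i - j]
        a[i] = k
        err *= 1.0 - k * k                  # updated prediction error
    return a


def lpc(x, order):
    """LPC polynomial A(z) = 1 + a1 z^-1 + ... by the autocorrelation method."""
    return levinson_durbin(autocorrelation(x, order), order)


def inverse_filter(x, a):
    """Apply the FIR prediction-error filter A(z) to obtain the residual."""
    order = len(a) - 1
    return [
        sum(a[j] * x[n - j] for j in range(order + 1) if n - j >= 0)
        for n in range(len(x))
    ]


if __name__ == "__main__":
    x = synth_resonator_signal(4000, seed=1)
    a = lpc(x, order=2)
    residual = inverse_filter(x, a)
    print("estimated A(z) coefficients:", [round(c, 3) for c in a])
    print("signal energy:  ", round(sum(v * v for v in x), 1))
    print("residual energy:", round(sum(v * v for v in residual), 1))
```

For the synthetic signal the estimated coefficients should land close to the generating filter (1, -1.5, 0.7), and the residual carries less energy than the input because the resonance has been removed. On a real vowel recording, the residual would instead approximate the glottal excitation with the vocal tract resonances flattened.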

Figure 8. Displacement, velocity and acceleration curves 
for an EL excitation based upon the Rosenberg glottal 
pulse type C [3], scaled to 1 g rms acceleration. This 
excitation results in a very unnatural, metallic speech 
quality. 

Figure 9. Block diagram of inverse filtering procedure to 
create lookup table for waveform generation. 

Figure 10. Displacement, velocity and acceleration 
curves for an EL excitation based upon inverse filtering 
of a recorded vowel, also scaled to 1 g rms acceleration. 
This excitation sounds much more natural, and even 
retains qualities of the original speaker. Time scale is 
extended to emphasize non-stationary nature. 

Figure 11. Block diagram of waveform generator for the 
new Electrolarynx Communication System. 
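
The block diagram of Figure 11 is not reproduced in this text, so the following is only one plausible reading of a table-driven generator with pitch and amplitude control: a looping read pointer steps through the stored glottal samples with linear interpolation, and its step size scales with the requested pitch. The class name, the `table_pitch_hz` parameter, and the interpolation scheme are assumptions for illustration, not details from the paper.

```python
class TableWaveformGenerator:
    """Loops a table of glottal samples with pitch and amplitude control."""

    def __init__(self, table, table_pitch_hz):
        if len(table) < 2:
            raise ValueError("table needs at least two samples")
        self.table = list(table)
        # Nominal pitch of the recording that produced the table (assumed known).
        self.table_pitch_hz = float(table_pitch_hz)
        self.phase = 0.0  # fractional read position in the table

    def next_sample(self, pitch_hz, amplitude):
        """Return one output sample for the requested pitch and amplitude."""
        n = len(self.table)
        i = int(self.phase)
        frac = self.phase - i
        a = self.table[i]
        b = self.table[(i + 1) % n]          # wrap around at the table end
        value = a + frac * (b - a)           # linear interpolation
        # Reading faster than 1 sample per step raises the pitch, slower lowers it.
        self.phase = (self.phase + pitch_hz / self.table_pitch_hz) % n
        return amplitude * value


if __name__ == "__main__":
    # A tiny toy table; a practical table would hold more than 2 seconds of samples.
    gen = TableWaveformGenerator([0.0, 1.0, 0.0, -1.0], table_pitch_hz=100.0)
    print([gen.next_sample(pitch_hz=50.0, amplitude=2.0) for _ in range(8)])
```

In such a scheme the pitch and amplitude arguments would be supplied sample by sample by the control module, and the table wraps around at its end, which is why a short table produces the audible periodicity noted in the text.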



The inverse filtering approach has interesting implications. If the 
user were to have a voice recording taken before the 
laryngectomy operation (hopefully well in advance before 
disease affects the voice), the EL could be customized to that 
voice. The user could therefore maintain some degree of 
individuality in the voice and hence reduce some of the hardship 
currently endured. Alternatively, the voice of a close relative 
might be adapted, or the user might select from a catalog of 
voices. 

4. CONCLUSIONS 

Considerable progress has been made in the design of source 
module components for the ELCS, which should enable the 
construction of a source-only EL prototype in the near future. 
This alone should offer a significant improvement over current 
EL devices. Work on the other modules is progressing. Of 
particular note is that a first human trial of a laryngeal nerve 
transposition recently took place at MEEI, which should enable 
work to commence on the processing of EMG signals to obtain 
pitch and amplitude control. If reliable control signals can be 
obtained, even more significant improvements to speech quality 
should be possible. 